- Title
- Boosting Exploration in Actor-Critic Algorithms by Incentivizing Plausible Novel States
- Creator
- Banerjee, Chayan; Chen, Zhiyong; Noman, Nasimul
- Relation
- 2023 62nd IEEE Conference on Decision and Control (CDC). Proceedings of the 2023 62nd IEEE Conference on Decision and Control (CDC) (Singapore, 13-15 December 2023), p. 7009-7014
- Publisher Link
- http://dx.doi.org/10.1109/CDC49753.2023.10383350
- Publisher
- Institute of Electrical and Electronics Engineers (IEEE)
- Resource Type
- conference paper
- Date
- 2023
- Description
- Improving exploration and exploitation through more efficient use of samples is a critical issue in reinforcement learning algorithms. A basic strategy is to facilitate exploration of the entire environment state space while encouraging visits to rarely visited states over frequently visited ones. Under this strategy, we propose a new method to boost exploration through an intrinsic reward, based on measuring a state's novelty and the associated benefit of exploring that state, collectively called plausible novelty. By incentivizing exploration of plausibly novel states, an actor-critic (AC) algorithm can improve its sample efficiency and, consequently, its training performance. The new method is verified through extensive simulations of continuous control tasks in MuJoCo environments, using a variety of prominent off-policy AC algorithms.
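- The description above outlines the general idea of augmenting an off-policy actor-critic algorithm with an intrinsic reward for novel states. Below is a minimal, generic sketch of intrinsic-reward shaping in Python; the count-based novelty estimate, the `beta` weight, and the `IntrinsicRewardShaper` class are illustrative assumptions and do not reproduce the plausible-novelty measure defined in the paper.

```python
# Minimal sketch (not the paper's exact formulation): shape the reward of an
# off-policy actor-critic transition with an intrinsic bonus so that rarely
# visited states earn extra reward before the transition enters the replay buffer.
import numpy as np


class IntrinsicRewardShaper:
    def __init__(self, beta: float = 0.1):
        self.beta = beta          # weight of the intrinsic bonus (assumed hyperparameter)
        self.visit_counts = {}    # toy count-based novelty estimate over discretized states

    def novelty(self, state: np.ndarray) -> float:
        # Count-based novelty: states visited less often score higher.
        key = tuple(np.round(state, 1))
        n = self.visit_counts.get(key, 0)
        self.visit_counts[key] = n + 1
        return 1.0 / np.sqrt(n + 1)

    def shape(self, extrinsic_reward: float, next_state: np.ndarray) -> float:
        # Total reward = extrinsic environment reward + weighted intrinsic bonus.
        return extrinsic_reward + self.beta * self.novelty(next_state)
```

- In use, the shaped reward would replace the raw environment reward when a transition is stored for an off-policy AC learner such as SAC or TD3; the paper's own method additionally weights novelty by an estimate of how beneficial exploring the state is.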
- Subject
- training; reinforcement learning; boosting; stability analysis
- Identifier
- http://hdl.handle.net/1959.13/1502678
- Identifier
- uon:55265
- Identifier
- ISBN:9798350301243
- Identifier
- ISSN:2576-2370
- Language
- eng
- Reviewed